DyDa: Dynamic Data Warehouse Maintenance in a Fully Concurrent Environment

Authors

  • Xin Zhang
  • Elke A. Rundensteiner
Abstract

Data warehouses (DW) are an emerging technology to support high-level decision making by gathering information from several distributed information sources (ISs) into one materialized repository. In dynamic environments such as the web, DWs must be maintained in order to stay up-to-date. Recent maintenance algorithms tackle this problem of DW management under concurrent data updates (DU), whereas the EVE system is the first to handle non-concurrent schema changes (SC) of ISs. However, the concurrency of schema changes by different ISs as well as the concurrency of interleaved SC and DU still remain unexplored problems. In this paper, we propose a solution framework called DyDa that successfully addresses both problems. The DyDa framework detects concurrent SCs by the broken query scheme and conflicting concurrent DUs by a local timestamp scheme. The two-layered architecture of the DyDa framework separates the concerns for concurrent DU and concurrent SC handling without imposing any restrictions on the autonomy or on the concurrent execution of the ISs. This DyDa solution is currently being implemented within the EVE data warehousing system.

Similar resources

Optimization Strategies for Data Warehouse Maintenance in Distributed Environments

Data warehousing is becoming an increasingly important technology for information integration and data analysis. Given the dynamic nature of modern distributed environments, both source data updates and schema changes are likely to occur autonomously and even concurrently in different data sources. Current approaches [31, 5] to maintain a data warehouse in such dynamic environments sequentially...

A Compensation-based Approach for Materialized View Maintenance in Distributed Environments

Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored in materialized views (MV) to allow better access, performance and high availability. MV must be maintained after the data sources change. In a loosely-coupled environment, such as the Data Grid, the data sources are autonomous. Hence the sou...

Speeding Up Incremental View Maintenance Using the Cuckoo Algorithm

A data warehouse is a repository of integrated data collected from various sources. A data warehouse can maintain data from various sources in the form of views, so each view must be maintained and updated whenever the sources change. Since a growing number of updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...

A Transactional Approach to Parallel Data Warehouse Maintenance

Data Warehousing is becoming an increasingly important technology for information integration and data analysis. Given the dynamic nature of modern distributed environments, both source data and schema changes are likely to occur autonomously and even concurrently in different sources. We have thus developed a comprehensive solution approach, called TxnWrap, that successfully maintains the ware...

Exploiting Versions for On-line Data Warehouse Maintenance in MOLAP Servers

A data warehouse is an integrated database whose data is collected from several data sources, and supports on-line analytical processing (OLAP). Typically, a query to the data warehouse tends to be complex and involves a large volume of data. To keep the data at the warehouse consistent with the source data, changes to the data sources should be propagated to the data warehouse periodically. Be...

Publication date: 2000